DOCUMENT RESUME

ED 371 042                                        TM 021 746

AUTHOR        Gipps, Caroline
TITLE         What Do We Mean by Equity in Relation to
              Assessment?
PUB DATE      Apr 94
NOTE          14p.; Paper presented at the Annual Meeting of the
              American Educational Research Association (New
              Orleans, LA, April 4-8, 1994).
PUB TYPE      Reports - Evaluative/Feasibility (142) --
              Speeches/Conference Papers (150)
EDRS PRICE    MF01/PC01 Plus Postage.
DESCRIPTORS   Access to Education; *Accountability; *Educational
              Assessment; Educational History; Educational Policy;
              Educational Practices; *Equal Education; Foreign
              Countries; Outcomes of Education; *Policy Formation;
              Scoring; *Test Bias; Test Construction; Testing
              Problems; Test Use
IDENTIFIERS   High Stakes Tests; National Curriculum; *Performance
              Based Evaluation; *United Kingdom



ABSTRACT 

The United Kingdom has a history of performance 
assessment even for accountability purposes, as the public 
examinations (standardized achievement tests) at age 16 demonstrate. 
What the country does not have is a strong history in the area of 
equity. Debate and policy-making, when concerned at all, have been 
concentrated on equality of opportunity, but there has been 
relatively little interest in equality of outcome. Equity does not 
imply equality of outcome and cannot presume identical experiences 
for all. Both are unrealistic. Equity in assessment rather implies 
that assessment practice and interpretation of results are fair and 
just for all groups. Experience in the United Kingdom with 
performance assessment suggests that high stakes performance 
assessment can change curriculum focus and broaden teaching. It is 
possible to use performance assessment for accountability and 
certification purposes. Problems do arise, and some of these are 
discussed in the context of assessment pertaining to the National 
Curriculum. Although there is no such thing as a perfectly fair test, 
paying attention to assessment administration and scoring can make 
tests more fair. Although equality of outcome is not possible, 
genuine equality of access is a necessary goal. (Contains 22 
references.) (SLD) 






AERA Conference 1994 






New Orleans 






What Do We Mean By Equity 
in Relation to Assessment? 



Dr Caroline Gipps 
University of London 
Institute of Education 
Curriculum Studies Department 
20 Bedford Way 
London WC1H 0AL 



paper presented as part of the symposium 
Equity Issues in Performance Assessment 







What do we mean by equity in relation to assessment? 



Introduction 

The UK has a history of Performance Assessment even for accountability purposes: 
the public examination at 16, the GCSE, involves written responses of the short 
answer and essay type, practical assessment, oral assessments, and extended pieces of 
coursework which are assessed in school by the teachers. All marking has a 
judgmental element and is done centrally by specially-trained teachers; it is then 
moderated either statistically or through inspection. Multiple choice testing, though 
it exists in the UK, has never been widely used and is not considered to be an 
appropriate basis for high-status examinations. At the other end of the age scale is 
the National Curriculum Assessment program for seven year olds; this involves 
teachers' own assessments of pupil attainment (TA or Teacher Assessment), two 
standardised tests, and some performance-based assessment tasks (STs or Standard 
Tasks). The problems and impact of this latter assessment program, together with 
the difficulties in developing and implementing it, were reported at this conference 
last year and the year before (Gipps 1992, 1993). 

Where we do not have a strong history is in the area of equity. Debate and policy 
making, where it has featured at all, has referred to equal opportunities in education 
with a brief excursion into compensatory education for disadvantaged groups. 
Early attempts to achieve equality of opportunity, for girls and boys, focused in the 
main on equality of resources and access to curriculum offerings; but this we now 
see as a naive approach to social equality given the very different out-of-school 
experiences of girls and boys. The fundamental problem is that this policy focus 
reflects a deficit model approach to inequality: girls are 'blamed' for behaving like 
girls and encouraged to behave more like boys. This model implies the possibility of 
overcoming disadvantage through the acquisition of what is lacking. This approach 
leaves the status quo essentially unchanged since girls are unlikely to achieve parity 
through equality of resources and formal equality of access alone. As Yates puts it, 
'where the criteria of success and the norms of teaching and curriculum are still 
defined in terms of the already dominant group, that group is always likely to 
remain one step ahead' (Yates 1985 p. 212). Equal opportunities is a policy area 
which has been hotly contested in the UK: it is seen by the extreme right as a 
revolutionary device which would disturb the 'natural' social order and as an 
attempt to attack White British society, and by the extreme left as essentially 
conservative because the gross disparities in wealth, power and status which 
characterise our society remain unchallenged. 

A second approach is one which looks for equality of outcome (as evidence of equal 
opportunities) and this underpins analyses and discussions of group performance at 
public examination level in the UK. The attitude to equity in the USA is very 
different from that in the UK, for reasons of history and because of the population 
structure: 'The US has a long-term commitment to equity for its wholly immigrant 
population' (Baker and O'Neil 1994, p. 3 of manuscript), and this is evidenced in equal 
outcome terms: "The term equity is used principally to describe fair educational 
access for all students; more recent judicial interpretations, however, have begun the 
redefinition of equity to move toward the attainment of reasonably equal group 
outcomes" (Baker and O'Neil 1994 p. 2 of manuscript) "... the educational equity 
principle should result in students receiving comparable education yielding 
comparable performances." (p. 4 op cit) 

In the UK equal opportunities has come to be defined as 'open competition for scarce 
resources' (Wood 1987) in the late 1980s/early 1990s. The notion of competition is, 
however, antithetical to equal outcomes: in a competition the best person wins the 
prize; competition is not designed to offer each individual the best outcome possible 
for them. In terms of education the latter is, of course, what we seek (while 
accepting that for some highly selective purposes identifying the 'best' individuals is 
necessary). Indeed 'fair' competition requires actual equal opportunities and a 
specification of the rules of the game so that all participants are equally 
well-prepared. 

Apple's (1989) review of public policy in the USA, Britain and Australia leads him to 
conclude that equality has been redefined: it is no longer linked to group oppression 
and disadvantage but is concerned to ensure individual choice within a 'free market' 
perception of the educational community. In Apple's view this redefinition has 
re-instated the disadvantage model and underachievement is once again the 
responsibility of the individual rather than the educational institution. 

He argues that attention in the equity and education debate must be refocused on 
important curricular questions, to which we add assessment questions, in the table 
below: 



Table 1  Curriculum and Assessment Questions in Relation to Equity¹

Curricular Questions                    Assessment Questions

Whose knowledge is taught?              What knowledge is assessed and
                                        equated with achievement?

Why is it taught in a particular        Are the form, content and mode of
way to this particular group?           assessment appropriate for different
                                        groups and individuals?

How do we enable the histories          Is this range of cultural knowledge
and cultures of people of colour,       reflected in definitions of
and of women, to be taught in           achievement? How does cultural
responsible and responsive ways?        knowledge mediate individuals'
                                        responses to assessments in ways
                                        which alter the construct being
                                        assessed?

¹ From Gipps and Murphy 1994 (and after Apple 1989).

Despite a lack of consensus there seems to be a general understanding that formal 
equality of opportunity is not sufficient to ensure fairness. Our view is that while 
one must strive for actual equality of opportunity, equality of outcomes is not an 
appropriate goal. The focus on equality of outcomes is, we feel, unsound, because 
different groups may indeed have different qualities and abilities and certainly 
experiences: manipulating test items and procedures in order to produce equal 
outcomes may be doing violence to the construct or skill being assessed and 
certainly camouflaging genuine group differences. We use therefore, given the 
contested nature of the equal opportunities concept in the UK, the concept of equity 
which is defined in the dictionary as moral justice, or the spirit of justice. 

Equity, in our view, does not imply equality of outcomes and does not presume 
identical experiences for all: both of these are unrealistic. The concept of equity in 
assessment as we use it implies that assessment practice and interpretation of results 
are fair and just for all groups. Our focus on equity in relation to assessment 
considers, therefore, not only the practices of assessment, but also the definition of 
achievement, whilst at the same time recognising that other factors, e.g. pupil 
motivation and esteem, teacher behaviour and expectation, also come into play in 
determining achievement. 

Equity and Assessment 

It is important to remember that 'objective' assessment has traditionally been seen as an 
instrument of equity: the notion of the standard test as a way of offering impartial 
assessment is of course a powerful one, though if equality of educational opportunity 
does not precede the test, then the 'fairness' of this approach is called into question. 
Most tests and examinations are amenable to coaching and pupils who have very 
different school experiences are not equally prepared to compete in the same test 
situation. 

As Madaus (1992) points out: 

"... in addressing the equity of alternative assessments in a high-stakes policy-driven 
exam system policy must be crafted that creates first and foremost a level playing 
field for students and schools. Only then can the claim be made that a national 
examination system is an equitable technology for making decisions about 
individuals, schools or districts" (p. 32). 

The same point is also made by Baker & O'Neil (1994). 

The traditional psychometric approach to testing operates on the assumption that 
technical solutions can be found to solve problems of equity with the emphasis on using 
elaborate techniques to eliminate biased items (Murphy 1990; Goldstein 1993). The 
limitation of this approach is that it does not look at the way in which the subject is 
defined (i.e. the overall domain from which test items are to be chosen), nor at the initial 
choice of items from the thus-defined pool, nor does it question what counts as 
achievement. It simply 'tinkers' with an established selection of items. Focusing on bias 
in tests, and statistical techniques for eliminating 'biased' items, not only confounds the 
construct being assessed, but has distracted attention from wider equity issues such as 
actual equality of access to learning, 'biased' curriculum, and inhibiting classroom 
practices. 



Bias in relation to assessment is generally taken to mean that the assessment is unfair to 
one particular group or another. This rather simple definition however belies the 
complexity of the underlying situation. Differential performance on a test, i.e. where 
different groups get different score levels, may not be the result of bias in the 
assessment; it may be due to real differences in performance among groups which may 
in turn be due to differing access to learning, or it may be due to real differences in the 
group's attainment in the topic under consideration. The question of whether a test is 
biased or whether the group in question has a different underlying level of attainment is 
actually extremely difficult to answer. Wood (1987) describes these different factors as 
the opportunity to acquire talent (access issues) and the opportunity to show talent to 
good effect (fairness in the assessment). 

When the existence of group differences in average performance on tests is taken to 
mean that the tests are biased, the assumption is that one group is not inherently less 
able than the other. However, the two groups may well have been subject to different 
environmental experiences or unequal access to the curriculum. This difference will be 
reflected in average test scores, but a test that reflects such unequal opportunity in its 
scores is not, strictly speaking, biased, though its use could be invalid. 

Hence, to achieve equity in assessment, interpretations of students' performance should 
be set in the explicit context of what is or is not being valued: an explicit account of the 
constructs being assessed and of the criteria for assessment will at least make the 
perspective and values of the test developer open to teachers and pupils. A 
considerable amount of effort over the years has gone into exploring cognitive deficits 
in girls in order to explain their poor performance on science tests; it was not until 
relatively recently that the question was asked whether the reliance on tasks and 
apparatus associated with middle class white males could possibly have something to 
do with it. As Goldstein (1993) points out, bias is built into the test developers' 
construct of the subject and their expectations of differential performance. 

Construct validity is the key to developing good quality assessment and we need to 
look at this not just in relation to the subject but also from the point of view of the pupil 
being assessed. Group moderation among teachers has the potential for focusing on the 
construct being assessed, initiating discussion about how the meaning of a task is 
construed by teachers and hopefully by pupils (Gipps 1994). 

Performance assessment cannot be developed using traditional psychometric 
techniques for analysing items, because far fewer items are involved and there is no 
assumed underlying score distribution. This may force a shift towards other ways of 
reviewing and evaluating items based on qualitative approaches, for example 
sensitivity review, a consideration of the construct, how the task might interact with 
experience etc.; such a development is to be welcomed. 

Pupils do not come to school with identical experiences and they do not have identical 
experiences at school. We cannot, therefore, expect assessments to have the same 
meaning for all pupils. What we must aim for, though, is an equitable approach where 
the concerns, contexts and approaches of one group do not dominate. This however is 
by no means a simple task; for example test developers are told that they should avoid 
context which may be more familiar to males than females or to the dominant culture. 
But there are problems inherent in trying to remove context effects by doing away with 
passages that advantage males or females, because it reduces the amount of assessment 
material available. De-contextualised assessment is anyway not possible, and complex 
reasoning processes require drawing on complex domain knowledge. Again, clear 
explanation of the constructs and contexts is important. 

In an assessment which looks for best rather than typical performance the context of the 
item should be the one which allows the pupil to perform well but this suggests 
different tasks for different groups which is in itself hugely problematic. However, 
what we can seek is the use, within any assessment programme, of a range of 
assessment tasks involving a variety of contexts, a range of modes within the 
assessment, and a range of response format and style. This broadening of approach is 
most likely to offer pupils alternative opportunities to demonstrate achievement if they 
are disadvantaged by any one particular assessment in the programme. 

Indeed this is included in the Criteria for Evaluation of Student Assessment Systems 
(NFA 1992): 

" - to ensure fairness, students should have multiple opportunities to meet standards 
and should be able to meet them in different ways" 

" - assessment information should be accompanied by information about access to 
the curriculum and about opportunities to meet the Standards" 

" - ... assessment results should be one part of a system of multiple indicators of the 
quality of education." (NFA 1992 p. 32) 

Performance Assessment 

I take performance assessment to mean assessments which: model the authentic task, 
i.e. require the pupil to perform in the assessment what we wish them to learn in the 
classroom; usually focus on higher levels of cognitive complexity; require the 
production of a response (in a range of modes); and require qualitative judgements 
to be used in the marking. 

The difference in the UK and USA settings in relation to performance assessment 
should not be underestimated: as explained in the Introduction, assessment in the 
UK is predominantly on the PA model. However, we are not using PA as a basis for 
bringing about educational reform. Furthermore, within PA we can list a range of 
approaches from the high status written examination, the Standard Tests and Tasks 
of the National Curriculum assessment program, teacher assessed coursework in 
GCSE, to portfolios and Records of Achievement. As PA is rather less well 
developed in the US there is a tendency to use the generic term: hence the concern 
among minority groups that 'alternative equals non-standard equals sub-standard' 
(Baker and O'Neil, 1994). In the UK, I should point out, not all PAs are considered 
equal: there is a world of difference between the public examination and the Record 
of Achievement and this is reflected in the status of these assessments: a Record of 
Achievement would not be considered sufficiently external and rigorous for 
selection and accountability purposes. The amount of information provided is also 
an issue: a percentage mark, grade or level is easier to use (for anything other than 
teaching purposes) than is the more qualitative, descriptive information from RoAs. 




It is important therefore to distinguish between say portfolio assessment and 
specified PA tasks which are set for trained teachers and marked by them, against 
specified criteria using agreed marking systems, with the system underpinned by 
moderation. These differences in approach will have significant effects on 
consistency of approach and scoring, affecting the construct assessed and inter-rater 
reliability, both of which are highly pertinent to equity issues, particularly if the 
assessments are used for high-stakes purposes. 

The question which seems to be being addressed in the US is, is PA a good form of 
assessment? This question has however to be deconstructed into: a good form of 
assessment for what purpose? and better than which other forms of assessment? 
Our experience in the UK would suggest that high stakes PA can change curriculum 
focus and broaden teaching: this happened as a result of the introduction of the 
GCSE at 16 (HMI 1988) and to a certain extent at age 7 with NC assessment (Gipps et 
al 1992). The strain on teachers which this sort of change brings should not be 
underestimated, however, nor indeed their need for in-service training in the subject 
area and the new assessment model. 

All forms of PA support school-based assessment and formative assessment better 
than standardised tests can - because of their flexibility and potential to assess 
constructs in more depth. Furthermore, it is possible to use a highly structured, 
externally marked and moderated PA program for accountability and certification 
purposes; the resources required to support such a program are significant, 
however, and the English experience is that it is manageable at only one or two 
points in the system (indeed there should be no need for more than this). 

Performance Assessment and Equity Issues in National Curriculum Assessment 

The National Assessment program requires that pupils are assessed across the full 
range of the National Curriculum using external tests and teachers' own assessment. 
The external tests were originally called Standard Assessment Tasks (SATs) and the 
teachers' assessments are called Teacher Assessment (TA). The original SATs used 
in 1991 and 1992 were true performance assessments and involved classroom based, 
externally-set, teacher assessed activities and tasks. For example, at age seven 
reading was assessed by the child reading aloud a short passage from a children's 
book chosen from a list of popular titles; other tasks involved using dice to play 
maths games, using objects to sort etc. At age 14 there were extended investigative 
projects in maths and science and assessments based on classroom work in English. 
What they had in common, across both ages, as well as the performance element, was 
classroom, rather than formal examination, administration; interaction between 
teacher and pupil with ample opportunity to explain the task; and a reduced emphasis 
on written responses, particularly at age seven. 

The role of communication in PA, involving spoken and written responses together 
with understanding of the instructions for the task, presents a significant threat to 
equity for minority language groups. 






Evidence from the piloting of the SATs for seven year olds in 1990 indicated that the 
bilingual children seemed more insecure initially when presented with new work in 
the SATs; when this was the case the peer group became a very important source of 
support. In an assessment situation, however, this posed difficulties for the teacher 
in deciding whether the intervention of another pupil had clarified the child's 
understanding of the question or supplied the correct answer. The 
misunderstanding of instructions was a serious problem for bilingual pupils: they 
appeared to relax and respond better when questions were rephrased in the mother 
tongue; they became more motivated and handled tasks more confidently. When 
activities were lengthy and complex there was a particular burden on bilingual 
children and examples of misunderstanding did not always come to light. Teachers 
felt that the bilingual children found it particularly difficult to show their true ability 
in maths and science. This was largely due to the difficulty of assessing oral 
responses in science interviews and the difficulties these children experienced in the 
group discussion element of science and maths investigations 
(NFER/BGC 1991). 

The teachers also reported a hazard in small group testing: where children worked 
in mixed groups for assessment the boys were sometimes more dominant and girls 
took a passive role, a commonly observed pattern of gendered performance. 

In our study of national assessment in primary schools², teachers reported strong 
feelings that the national assessment programme at age seven was inevitably unjust 
for bilingual children (Gipps, 1992). Since their English language skills were still in the 
early stages of development these children were disadvantaged in any assessment that 
was carried out for comparative purposes. These teachers felt that formal summative 
(or accountability) assessment for comparison is, at this age, unfair for such children 
and thus runs counter to their notion of equity. These teachers had similar views about 
children from disadvantaged backgrounds but their feelings about bilingual learners 
were particularly strong. 

That said, there was a feeling that the SATs, for all their difficulties of 
classroom management, time and unmanageability, and despite their heavy reliance 
on language, did offer a better assessment opportunity for children with special 
needs and bilingual learners than would more traditional paper and pencil tests. 
Our teachers' view was that, whatever the level of disadvantage for their bilingual 
learners in summer 1991 (and this was where their anxieties lay, not with gender 
issues), it would be increased in the more formal testing which they 
anticipated in summer 1992. 

Furthermore, children who were second language learners tended to perform less 
well than other pupils on the SATs, but there was some evidence that they 
performed better on the SATs than in the TA, and this was a fairly widespread 
finding. This suggests that structured PA was better for minority pupils than 
(unmoderated) Teacher Assessment (and, one could deduce, better than nothing) 
since in effect, teacher stereotype was being challenged. 



² National Assessment in Primary School: an evaluation. 
ESRC Grant No. R 000 23 21 92 



In the piloting of SATs for 14 year olds in 1991 there was a detailed investigation of 
teachers' views in relation to these assessment tasks in maths (CATS 1991) but not in 
other subjects. The teachers administering the SATs felt that the nature of the SAT 
rendered it accessible to pupils who were not fluent in English. Aspects which 
contributed to this included: interaction with the teacher, the practical elements of the 
tasks, a normal classroom atmosphere, interactions with other pupils and the variety of 
presentation and assessment modes. The conclusion made is that for pupils who were 
not fluent in English, written materials cannot enable the demonstration of attainment 
without teacher-pupil interaction. Most of these teachers felt that pupils who were not 
fluent in English could engage in the SAT activity. Thus the style of the activity was 
appropriate for most of these pupils. However, only a third of teachers thought that the 
SAT enabled pupils to demonstrate appropriate attainment. This comment no doubt is 
related to the fact that overall the attainment of non-fluent pupils was below that of 
others in both the SAT and the TA. However, analysis of the SATs showed that pupils 
who were not fluent were scoring higher on the SAT than in the TA which suggests that 
the TA awarded by the teachers may have been an underestimate due to the pupils' 
perceived language difficulties, and that the SATs facilitated high performance for 
non-fluent pupils to a greater extent than they did for others. The pilot report of the maths 
scheme states that 'if pupils who are not fluent in English are to be entitled to a fair 
assessment it is essential that the SATs retain the interactive, practical and flexible 
aspects.' (CATS 1991) 

The report from the School Examinations and Assessment Council (SEAC) which draws 
together the KS3 findings in the 1991 pilot (SEAC, 1991) from the various agencies does 
not discuss performance by ethnic group (presumably because the sample sizes were 
small) but points out that for bilingual learners performance was "relatively high". This, 
they suspect, is because teachers were able to provide the normal classroom support for 
these pupils during the SATs and because the materials were generally accessible to 
pupils whose home language is not English. They warn that the opportunity for these 
pupils to demonstrate their full potential will be reduced if the SATs change to timed 
written tests (as will be the case for pupils with special educational needs). 

An important point emerges from these National Curriculum Assessment studies, at 
both age seven and fourteen: The SAT-type activity with its emphasis on active, multi- 
mode assessment and detailed interaction between teacher and pupil may, despite the 
heavy reliance on language, be a better opportunity for minority and special needs 
children to demonstrate what they know and can do than traditional, formal tests with 
limited instructions. The key aspects seem to be: 

- a range of activities, 

- match to classroom practice 

- extended interaction between pupil and teacher to explain the task 

- normal classroom setting 

- a range of response modes other than written. 

Furthermore, many of these pupils performed better on the SAT than in the TA and this 
made the teachers think hard about their evaluation of the pupils. 

This point is made quite strongly in the SEAC review of the KS3 pilots: "The relative 
success of the 1991 SATs for children with special educational needs and English as a 
second language will be reduced in 1992, unless procedures are established to allow the 
1992 tests to be used flexibly with these pupils"; and "The differential performance of 
boys and girls is likely to be affected by the change to a largely written mode of 
assessment. It is important that this is monitored" (SEAC 1991, p. 2). 

Conclusion 

There is no such thing as a fair test, nor could there be: the situation is too complex 
and the notion simplistic. However, by paying attention to what we know about 
factors in assessment and their administration and scoring we can begin to work 
towards tests that are more fair to all the groups likely to be taking them, and this is 
particularly important for assessment used for summative and accountability 
purposes. 

One reason why we cannot look for fair tests is that we cannot assume identical 
experiences for all. This is also why we do not look for equal outcomes; for this we 
would need assessment tailored to different groups. We do argue, therefore, that on 
the grounds of equity all groups be offered actual equality of access to the 
curriculum and that the exams and assessments are as fair as possible to all groups. 

So how do we ensure that assessment practice and interpretation of results are as fair 
as possible for all groups? It is likely that a wide ranging review of syllabus content, 
teacher attitude to both boys and girls, assessment mode and item format is 
required, as Table 1 shows, if we wish to make assessment as fair as 
possible to both genders. Although this is a major task, it is one which must be 
addressed in the developing context of national standards, national curriculum, and 
national assessment. As an example that it is possible, we offer the case of Physics 
Higher School Certificate in South Australia; girls' performance in physics has 
improved since 1988 when a female chief examiner was appointed. This examiner 
has deliberately worked within a particular model of physics (which takes a 'whole 
view' of the subject); simplified the language of the questions; included only contexts 
that are integral to particular physics problems; offered a range of different ways of 
answering questions which does not privilege one form of answer over another; 
provided a list of key instruction words and how students would go about 
answering questions which include these words (ESSSA 1992). 

We need to encourage clearer articulation of the test/exam developers' construct on 
which the assessment is based, so that the construct validity may be examined by test 
takers and users. Test developers need to give a justification for inclusion of context 
and types of response mode in relation to the evidence we have about how this interacts 
with gender and curriculum experience. The ethics of assessment demand that the 
constructs and assessment criteria are made available to pupils and teachers, and that a 
range of tasks and assessments be included in an assessment programme. These 
requirements are consonant with enhancing construct validity in any case. Given the 
detailed, and as yet poorly understood, effect of context (Murphy 1993) on performance, 
the evidence that girls attend to context in an assessment task more than do boys, and 
the ability of changes in the context of the task to alter the construct being assessed, this 
is an area of validity which demands detailed study. We certainly need to define the 
context of an assessment task and the underlying constructs and make sure they reflect 
what is taught. 

We must encourage the use of a range of modes and task style; we need also to 
expand the range of indicators used: 

"Multiple indicators are essential so that those who are disadvantaged on one 
assessment have an opportunity to offer alternative evidence of their 
expertise." 

(Linn, 1992, p.44) 

Assessment which elicits an individual's best performance involves tasks that are 
concrete and within the experience of the pupil (an equal access issue), presented clearly 
(the pupil must understand what is required of her if she is to perform well), relevant to 
the current concerns of the pupil (to engender motivation and engagement), and set in 
conditions that are not threatening (to reduce stress and enhance performance) (after 
Nuttall, 1987). 

Although we do not look for equality of outcome, we must continue to seek genuine 
equality of access; this means that all courses, subjects studied, examinations, etc. are 
actually equally available to all groups and are presented in such a way that all 
groups feel able to participate fully. One suggestion from the United States is that, 
since opportunity to learn is a key factor in performance, schools may have to 'certify 
delivery standards' as part of a system for monitoring instructional experiences 
(Linn, 1993). How realistic it is to do this remains to be seen, but it does put the onus 
on schools to address the issue of equal access, at an actual rather than formal level. 

We need to be clear about what counts as proper preparation of pupils in any 
assessment programme. If there are preparation practices which are considered to be 
unethical then they should be spelled out. The other side of the coin is that teachers and 
schools have a commitment to teach pupils the material on which they are going to be 
formally assessed. To this requirement we should add proper preparation of teachers 
so that they understand the basic issues in assessment and are equipped to carry out 
good formative assessment. 

Finally, a point made by one of the reviewers of this symposium was that the 
fundamental issue in equity and assessment is: are group differences in measures 
'real' or are they the result of the measuring system? This is, of course, the $64,000 
question, but the answer is likely to be 'a bit of both', and what we need to do is to 
minimise the latter, while understanding and articulating causes of the former. 
Another reviewer commented: is there another approach to PA which can avoid the 
irresolvable issue of equity? I believe we have shown that aspects of the SATs which 
allowed greater explanation of the task, reduced the emphasis on written response, 
and reduced stress levels (because both the task and the setting were familiar) are 
crucially important. What is interesting to us is that there is concern about equity 
issues in the USA in the shift from standardised tests to a PA model, while in the UK 
we are concerned about equity issues in the shift from PA to a more standardised 
model. 






References 

Apple M W (1989) 'How Equality has been redefined in the Conservative Restoration' in 
Secada W (Ed) Equity and Education. New York: The Falmer Press 

Baker E and O'Neil H (1994) Performance Assessment and Equity: A view from the 
USA. Assessment in Education, Vol 1, No 1 (in press) 

CATS (1991) Pilot Report 1991 Key Stage 3 Maths, SEAC. 

ESSSA (1992) Gender Equity in Senior Secondary School Assessment Project: Third 
Progress Report September 1992, Senior Secondary Assessment Board of South 
Australia. 

Gipps C (1992) National Testing at Seven: What can it tell us?. Paper presented at AERA 
Conference 1992, San Francisco 

Gipps C (1993) Reliability, Validity and Manageability in Large Scale Performance 
Assessment, Paper presented at AERA Conference, Atlanta, 1993 

Gipps C (1994) Beyond Testing: towards a theory of educational assessment. Falmer 
(in press) 

Gipps C and Murphy P (1994) A Fair Test? Assessment, Achievement and Equity 
Open University Press 

Gipps C, McCallum B, McAlister S and Brown M (1992) National Assessment at Seven: 
some emerging themes, in C Gipps (Ed) Developing Assessment for the National 
Curriculum. Bedford Way Series, ULIE/Kogan Page 

Goldstein H (1993) 'Assessing Group Differences' Oxford Review of Education Vol 19 
No 2, 141-150 

HMI (1988) The Introduction of the GCSE in Schools 1986-1988. London: HMSO 

Linn M C (1992) Gender Differences in Educational Achievement, in Sex Equity in 
Educational Opportunity, Achievement and Testing. E.T.S. 

Linn R L (1993) 'Educational Assessment: Expanded Expectations and Challenges' 
Educational Evaluation and Policy Analysis, Vol 15 No 1 

Madaus G (1992) A Technological and Historical Consideration of Equity Issues 
Associated with Proposals to Change the Nation's Testing Policy. Paper presented at 
symposium on Equity and Educational Testing and Assessment March 1992, 
Washington DC 

Murphy P (1990) Gender Differences - Implications for Assessment and Curriculum 
Planning. Paper to BERA Conference 1990 




Murphy P (1993) Some teacher dilemmas in practising authentic assessment, paper 
presented to AERA Annual Meeting, Atlanta, 1993 

National Forum on Assessment (1992) 'Criteria for Evaluation of Student Assessment 
Systems' Educational Measurement: Issues and Practice Spring 1992, 32 

NFER/BGC (1991) An Evaluation of National Curriculum Assessment. Reports. June 
1991 

Nuttall D (1987) 'The Validity of Assessments' European Journal of Psychology of 
Education Vol 11 No 2, 109-118 

SEAC (1991) National Curriculum Assessment at Key Stage 3: a review of the 1991 
pilots with implications for 1992. EMU: SEAC 

Wood R (1987) 'Assessment and Equal Opportunities'. Text of Public lecture at ULIE (11 
November 1987). 

Yates L (1985) Is 'girl friendly schooling' really what girls need? in Whyte J, Deem R, 
Kant L & Cruikshank M (eds) Girl Friendly Schooling. London: Methuen. 
